# Computer Architecture

Professor 이성원

Project #4

Department: Computer and Information Engineering

Student ID: 2019202103

Name: 이은비

Submission date: 2023.06.15

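#### <Experiment Description>

The goal is to find the cache configuration best suited to the sorting programs and to the benchmark programs, and to analyze the programs to explain why the best configuration differs between them. The testbenches below compare bubble sort and random sort. Bubble sort is one of the most basic and simple sorting algorithms: it examines two adjacent elements and swaps them when they are out of order. For ascending order, the two elements are swapped when the earlier value is larger; for descending order, the reverse applies. Random sort, as the name suggests, sorts in a random order. The two can be compared using the provided project materials, and their testbenches are shown below. The output immediately below is for bubble sort.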



The data is written as it hits in the IM and the DM (cache), respectively.



A main memory read is performed here.



Because bubble sort compares two items at a time, the corresponding cache writes also proceed in pairs.
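The adjacent-pair comparison described above can be sketched as follows. This is a minimal illustration in Python, not the testbench code from the project materials; the function name `bubble_sort` and the `ascending` parameter are assumptions made for this example.

```python
def bubble_sort(values, ascending=True):
    """Return a sorted copy of `values` using adjacent-pair comparisons."""
    data = list(values)
    n = len(data)
    # After each pass, the last element of the unsorted range is in place.
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Two neighbouring items are read and compared on every step.
            if ascending:
                out_of_order = data[i] > data[i + 1]
            else:
                out_of_order = data[i] < data[i + 1]
            if out_of_order:
                # A swap writes both items back, hence writes come in pairs.
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break  # already sorted, no further passes needed
    return data
```

Each step touches two neighbouring addresses, which is why the access pattern of bubble sort has strong spatial locality.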

The random sort testbench is shown below.



As with bubble sort above, the initial writes to the cache proceed in the same way.





In bubble sort, the writes during the pairwise comparison were spaced one apart; in random sort, the writes also occur in random order.

#### <Verification Strategy, Analysis, and Results>

The basic cache hierarchy is as follows.



The L1 cache can be split into an instruction cache and a data cache, while the L2 cache is a unified cache. The simulations below are based on this structure.

◆ Unified cache / Separate cache

The AMAT formula for this case is as follows.

AMAT = %instr x (instr hit time + instr miss rate x instr miss penalty) +
%data x (data hit time + data miss rate x data miss penalty)
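The formula above can be evaluated directly. The sketch below is a plain transcription of it into Python; the function name `amat_split` and the example numbers (hit time of 1 cycle, miss penalty of 10 cycles, 75%/25% instruction/data mix) are hypothetical values chosen for illustration, since the report does not state the hit times and penalties that were used.

```python
def amat_split(instr_frac, instr_hit, instr_miss_rate, instr_penalty,
               data_frac, data_hit, data_miss_rate, data_penalty):
    """AMAT for separate instruction and data caches (weighted sum)."""
    instr_term = instr_frac * (instr_hit + instr_miss_rate * instr_penalty)
    data_term = data_frac * (data_hit + data_miss_rate * data_penalty)
    return instr_term + data_term


# Hypothetical inputs, not measurements from this report.
example = amat_split(
    instr_frac=0.75, instr_hit=1, instr_miss_rate=0.1, instr_penalty=10,
    data_frac=0.25, data_hit=1, data_miss_rate=0.2, data_penalty=10,
)
print(f"AMAT = {example:.2f} cycles")
```

With these assumed inputs, the instruction term contributes 0.75 x 2 = 1.5 and the data term 0.25 x 3 = 0.75.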

Below is an example of the simulation process, with L1 configured as a unified cache.

```
-cache:il1 il1:64:16:1:l
-cache:dl1 none
-cache:il2 none
-cache:dl2 none
-tlb:itlb none
-tlb:dtlb none
```
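The cache option string can be decoded to check how large each configuration is. The sketch below assumes the SimpleScalar field order `<name>:<nsets>:<bsize>:<assoc>:<repl>` (number of sets, block size in bytes, associativity, replacement policy); the helper names `parse_cache_config` and `capacity_bytes` are made up for this example.

```python
from typing import NamedTuple


class CacheConfig(NamedTuple):
    name: str
    nsets: int   # number of sets
    bsize: int   # block size in bytes
    assoc: int   # associativity (ways per set)
    repl: str    # replacement policy letter, e.g. "l" for LRU


def parse_cache_config(spec):
    """Split a string such as 'il1:64:16:1:l' into its five fields."""
    name, nsets, bsize, assoc, repl = spec.split(":")
    return CacheConfig(name, int(nsets), int(bsize), int(assoc), repl)


def capacity_bytes(cfg):
    """Total data capacity = sets x block size x ways."""
    return cfg.nsets * cfg.bsize * cfg.assoc


cfg = parse_cache_config("il1:64:16:1:l")
print(cfg, capacity_bytes(cfg))
```

Under that assumption, `il1:64:16:1:l` is a direct-mapped cache with 64 sets of 16-byte blocks, that is, 1024 bytes in total.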

| Statistic         | Left output   | Right output  | Description                                 |
|-------------------|---------------|---------------|---------------------------------------------|
| sim_num_insn      | 119392269     | 29241771      | total number of instructions executed       |
| sim_num_refs      | 44164172      | 8043255       | total number of loads and stores executed   |
| sim_elapsed_time  | 7             | 2             | total simulation time in seconds            |
| sim_inst_rate     | 17056038.4286 | 14620885.5000 | simulation speed (in insts/sec)             |
| il1.accesses      | 119392269     | 29241771      | total number of accesses                    |
| il1.hits          | 80660601      | 21681208      | total number of hits                        |
| il1.misses        | 38731668      | 7560563       | total number of misses                      |
| il1.replacements  | 38731604      | 7560531       | total number of replacements                |
| il1.writebacks    | 0             | 0             | total number of writebacks                  |
| il1.invalidations | 0             | 0             | total number of invalidations               |
| il1.miss_rate     | 0.3244        | 0.2586        | miss rate (i.e., misses/ref)                |
| il1.repl_rate     | 0.3244        | 0.2586        | replacement rate (i.e., repls/ref)          |
| il1.wb_rate       | 0.0000        | 0.0000        | writeback rate (i.e., wrbks/ref)            |
| il1.inv_rate      | 0.0000        | 0.0000        | invalidation rate (i.e., invs/ref)          |
| dl1.accesses      |               | 8146520       | total number of accesses                    |
| dl1.hits          |               | 6051951       | total number of hits                        |
| dl1.misses        |               | 2094569       | total number of misses                      |
| dl1.replacements  |               | 2094537       | total number of replacements                |
| dl1.writebacks    |               | 461897        | total number of writebacks                  |
| dl1.invalidations |               | 0             | total number of invalidations               |
| dl1.miss_rate     |               | 0.2571        | miss rate (i.e., misses/ref)                |
| dl1.repl_rate     |               | 0.2571        | replacement rate (i.e., repls/ref)          |
| dl1.wb_rate       |               | 0.0567        | writeback rate (i.e., wrbks/ref)            |
| dl1.inv_rate      |               | 0.0000        | invalidation rate (i.e., invs/ref)          |
| ul2.accesses      |               | 10117029      | total number of accesses                    |
| ul2.hits          |               | 9832183       | total number of hits                        |
| ld_text_base      | 0x00400000    |               | program text (code) segment base            |
| ld_text_size      | 2166768       |               | program text (code) size in bytes           |
| ld_data_base      | 0x10000000    |               | program initialized data segment base       |
| ld_data_size      | 264644        |               | program init'ed `.data' and uninit'ed       |

Below, L1 is configured as separate caches: a data cache and an instruction cache.

-cache:dl1 dl1:32:16:1:l
-cache:dl2 none
-cache:il1 il1:32:16:1:l
-cache:il2 none

sim: \*\* simulation statistics \*\*

```
statistic            left output      right output     # description
sim_num_insn         119392269        29241771         # total number of instructions executed
sim_num_refs         44164172         8043255          # total number of loads and stores executed
sim_elapsed_time     7                2                # total simulation time in seconds
sim_inst_rate        17056038.4286    14620885.5000    # simulation speed (in insts/sec)
il1.accesses         119392269        29241771         # total number of accesses
il1.hits             73761455         21681208         # total number of hits
il1.misses           45630814         7560563          # total number of misses
il1.replacements     45630782         7560531          # total number of replacements
il1.writebacks       0                0                # total number of writebacks
il1.invalidations    0                0                # total number of invalidations
il1.miss_rate        0.3822           0.2586           # miss rate (i.e., misses/ref)
il1.repl_rate        0.3822           0.2586           # replacement rate (i.e., repls/ref)
il1.wb_rate          0.0000           0.0000           # writeback rate (i.e., wrbks/ref)
il1.inv_rate         0.0000           0.0000           # invalidation rate (i.e., invs/ref)
dl1.accesses         44513362         8146520          # total number of accesses
dl1.hits             34150595         6051951          # total number of hits
dl1.misses           10362767         2094569          # total number of misses
dl1.replacements     10362735         2094537          # total number of replacements
dl1.writebacks       4581715          461897           # total number of writebacks
dl1.invalidations    0                0                # total number of invalidations
dl1.miss_rate        0.2328           0.2571           # miss rate (i.e., misses/ref)
dl1.repl_rate        0.2328           0.2571           # replacement rate (i.e., repls/ref)
dl1.wb_rate          0.1029           0.0567           # writeback rate (i.e., wrbks/ref)
dl1.inv_rate         0.0000           0.0000           # invalidation rate (i.e., invs/ref)
ul2.accesses         -                10117029         # total number of accesses
ul2.hits             -                9832183          # total number of hits
ul2.misses           -                284846           # total number of misses
ul2.replacements     -                284590           # total number of replacements
ul2.writebacks       -                36822            # total number of writebacks
ul2.invalidations    -                0                # total number of invalidations
ld_text_base         0x00400000       -                # program text (code) segment base
ld_text_size         2166768          -                # program text (code) size in bytes
ld_data_base         0x10000000       -                # program initialized data segment base
```

| # of Sets | Unified cache Miss rate | Unified cache AMAT | Split cache Inst. Miss rate | Split cache Data Miss rate | Split cache AMAT |
|-----------|-------------------------|--------------------|-----------------------------|----------------------------|------------------|
| 64        | 0.32440683              | 1.10523            | 0.38219237                  | 0.23280055                 | 1.20026989       |
| 128       | 0.26866009              | 6.01218            | 0.32440683                  | 0.19308457                 | 1.13519851       |
| 256       | 0.23193162              | 6.05378            | 0.26866116                  | 0.11305403                 | 6.08496235       |
| 512       | 0.16395908              | 6.02688            | 0.23193162                  | 0.07642519                 | 1.05962383       |

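This table was prepared with reference to the lab lecture materials. The instruction miss rate is higher than the data miss rate, and I judged that the split cache has a comparatively larger AMAT.

Under the same conditions, the testbenches can also be compared with each other, as shown below.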
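The miss rates in the table follow from the raw counters in the simulator output (misses divided by accesses). The sketch below recomputes them from the cc1 counters quoted in this report; the helper name `miss_rate` is an assumption made for this example.

```python
def miss_rate(misses, accesses):
    """Miss rate as reported by the simulator: misses per access."""
    if accesses == 0:
        return 0.0
    return misses / accesses


# Counters copied from the cc1 outputs in this report.
unified_64 = miss_rate(38731668, 119392269)   # il1, 64 sets
split_il1 = miss_rate(45630814, 119392269)    # il1, 32 sets
split_dl1 = miss_rate(10362767, 44513362)     # dl1, 32 sets

for label, rate in [("unified il1", unified_64),
                    ("split il1", split_il1),
                    ("split dl1", split_dl1)]:
    print(f"{label}: {rate:.4f}")
```

Rounded to four places, these agree with the `miss_rate` lines printed by the simulator (0.3244, 0.3822, and 0.2328).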

| Statistic         | Left output   | Right output |
|-------------------|---------------|--------------|
| sim_num_insn      | 119392269     |              |
| sim_num_refs      | 44164172      | 8043255      |
| sim_elapsed_time  | 7             |              |
| sim_inst_rate     | 17056038.4286 |              |
| il1.accesses      | 119392269     |              |
| il1.hits          | 73761455      |              |
| il1.misses        | 45630814      |              |
| il1.replacements  | 45630782      |              |
| il1.writebacks    | 0             |              |
| il1.invalidations | 0             |              |
| il1.miss_rate     | 0.3822        |              |
| il1.repl_rate     | 0.3822        |              |
| il1.wb_rate       | 0.0000        |              |
| il1.inv_rate      | 0.0000        |              |
| dl1.accesses      | 44513362      |              |
| dl1.hits          | 34150595      |              |
| dl1.misses        | 10362767      |              |
| dl1.replacements  | 10362735      |              |

◆ L1 cache size / L2 cache size

The left output is for cc1 and the right output is for ijpeg, with a split cache size of 64.

```
sim: ** simulation statistics **
statistic            left (cc1)       right (ijpeg)
sim_num_insn         119392269        29241771
sim_num_refs         44164172         8043255
sim_elapsed_time     7                2
sim_inst_rate        17056038.4286    14620885.5000
il1.accesses         119392269        29241771
il1.hits             80660601         21681208
il1.misses           38731668         7560563
il1.replacements     38731604         7560531
il1.writebacks       0                0
il1.invalidations    0                0
il1.miss_rate        0.3244           0.2586
il1.repl_rate        0.3244           -
il1.wb_rate          -                0.0000
il1.inv_rate         -                0.0000
dl1.accesses         -                8146520
dl1.hits             -                6051951
dl1.misses           -                2094569
dl1.replacements     -                2094537
dl1.writebacks       -                461897
dl1.miss_rate        -                0.2571
dl1.repl_rate        -                0.2571
dl1.wb_rate          -                0.0567
dl1.inv_rate         -                0.0000
ul2.accesses         -                10117029
ul2.hits             -                9832183
ul2.replacements     -                284590
ul2.writebacks       -                36822
ld_data_base         0x10000000       -
```

The left output is for cc1 and the right output is for ijpeg, with a split cache size of 128.

```
sim: ** simulation statistics **
statistic            left (cc1)       right (ijpeg)
sim_num_insn         119392269        29241771
sim_num_refs         44164172         -
```

The left output is for cc1 and the right output is for ijpeg, with a split cache size of 512.

The following results were obtained with the hierarchy divided into L1 and L2 caches.

| L1I/L1D/L2U | Inst.Miss rate | Data.Miss rate | Unified Cache<br>Miss rate | AMAT       |
|-------------|----------------|----------------|----------------------------|------------|
| 8/8/1024    | 0.4572         | 0.2571         | 0.2603                     | 0,275/3225 |
| 16/16/512   | 0.42494437     | 0.302/3/13     | 0.0281551                  | 0.21183286 |
| 32/32/256   | 0.38219137     | 0.23280055     | 0.0281551                  | 0.20026488 |
| 64/64/128   | 0.32440683     | 0.17308457     | 0.0281551                  | 0.13519851 |
| 128/128/0   | 0.26866116     | 0.11305403     | 0                          | 0.08496    |

Both the miss rate and the AMAT decrease as the cache size increases.
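For an L1/L2 hierarchy, AMAT is commonly computed by nesting the L2 access inside the L1 miss penalty. The sketch below uses that textbook form; the report does not state which formula or latencies produced the AMAT column above, so the function name `amat_two_level` and all numbers in the example are assumptions for illustration only.

```python
def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_penalty):
    """AMAT = L1 hit + L1 miss rate x (L2 hit + L2 miss rate x memory penalty)."""
    l1_miss_penalty = l2_hit + l2_miss_rate * mem_penalty
    return l1_hit + l1_miss_rate * l1_miss_penalty


# Hypothetical latencies in cycles, not values taken from this report.
example_amat = amat_two_level(
    l1_hit=1, l1_miss_rate=0.1, l2_hit=10, l2_miss_rate=0.2, mem_penalty=100
)
print(f"two-level AMAT = {example_amat:.2f} cycles")
```

In this form, a larger L2 lowers `l2_miss_rate` and therefore the effective L1 miss penalty, while a larger L1 lowers `l1_miss_rate` directly.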

◆ Large block size / Small block size

Both the miss rate and the AMAT decrease as the block size increases.

```
sim: ** simulation statistics **
sim_num_insn              119392269 # total number of instructions executed
sim_num_refs               44164172 # total number of loads and stores executed
sim_elapsed_time                  6 # total simulation time in seconds
sim_inst_rate         19898711.5000 # simulation speed (in insts/sec)
il1.accesses              119392269 # total number of accesses
il1.hits                   99816822 # total number of hits
il1.misses                 19575447 # total number of misses
il1.replacements           19574935 # total number of replacements
il1.writebacks                    0 # total number of writebacks
il1.invalidations                 0 # total number of invalidations
il1.miss_rate                0.1640 # miss rate (i.e., misses/ref)
il1.repl_rate                0.1640 # replacement rate (i.e., repls/ref)
il1.wb_rate                  0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                 0.0000 # invalidation rate (i.e., invs/ref)
ld_text_base             0x00400000 # program text (code) segment base
ld_text_size                2166768 # program text (code) size in bytes
ld_data_base             0x10000000 # program initialized data segment base
ld_data_size                 264644 # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base            0x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size                 16384 # program initial stack size
ld_prog_entry            0x00400140 # program entry point (initial PC)
ld_environ_base          0x7fff8000 # program environment base address address
ld_target_big_endian              0 # target executable endian-ness, non-zero if big endian
mem.page_count                  806 # total number of pages allocated
mem.page_mem                  3224k # total size of memory pages allocated
mem.ptab_misses                 816 # total first level page table misses
mem.ptab_accesses         579990118 # total page table accesses
mem.ptab_miss_rate           0.0000 # first level page table miss rate
```

| Block size | Unified cache<br>Miss rate | AMAT       |
|------------|----------------------------|------------|
| 16         | 0.1640                     | 1.22306084 |
| 64         | 0.0322                     | 1.03430942 |
| 128        | 0.0136                     | 1.01399238 |
| 256        | 0.0067                     | 1.00679015 |
| 512        | 0.0042                     | 1.00423521 |
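The trend in the table can be reproduced qualitatively with a toy direct-mapped cache model. The sketch below is not the SimpleScalar simulator; the function name `direct_mapped_miss_rate` and the sequential 4-byte access trace are assumptions chosen to show how a larger block exploits spatial locality.

```python
def direct_mapped_miss_rate(addresses, nsets, block_size):
    """Simulate a direct-mapped cache and return misses / accesses."""
    tags = [None] * nsets  # one stored tag per set, None means empty
    misses = 0
    for addr in addresses:
        block = addr // block_size   # block number in memory
        index = block % nsets        # set the block maps to
        tag = block // nsets         # identifies the block within the set
        if tags[index] != tag:
            misses += 1
            tags[index] = tag        # fetch the block, evicting the old one
    return misses / len(addresses) if addresses else 0.0


# Hypothetical trace: 1024 sequential 4-byte accesses.
trace = list(range(0, 4096, 4))
rates = {bs: direct_mapped_miss_rate(trace, nsets=64, block_size=bs)
         for bs in (16, 64, 128, 256, 512)}
for bs, rate in rates.items():
    print(bs, rate)
```

For a purely sequential trace, only the first access to each block misses, so the miss rate is the access size divided by the block size. Real programs such as cc1 have less regular accesses, so the measured reduction is smaller than in this idealized case.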

◆ Direct-mapped / Set-Associative

Both the miss rate and the AMAT decrease as the associativity (number of ways) increases, and they also decrease as the number of sets increases.

```
sim: ** simulation statistics **
sim_num_insn              119392269 # total number of instructions executed
sim_num_refs               44164172 # total number of loads and stores executed
sim_elapsed_time                  7 # total simulation time in seconds
sim_inst_rate         17056038.4286 # simulation speed (in insts/sec)
il1.accesses              119392269 # total number of accesses
il1.hits                   80660601 # total number of hits
il1.misses                 38731668 # total number of misses
il1.replacements           38731604 # total number of replacements
il1.writebacks                    0 # total number of writebacks
il1.invalidations                 0 # total number of invalidations
il1.miss_rate                0.3244 # miss rate (i.e., misses/ref)
il1.repl_rate                0.3244 # replacement rate (i.e., repls/ref)
il1.wb_rate                  0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                 0.0000 # invalidation rate (i.e., invs/ref)
ld_text_base             0x00400000 # program text (code) segment base
ld_text_size                2166768 # program text (code) size in bytes
ld_data_base             0x10000000 # program initialized data segment base
ld_data_size                 264644 # program init'ed `.data' and uninit'ed `.bss' size in bytes
ld_stack_base            0x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size                 16384 # program initial stack size
ld_prog_entry            0x00400140 # program entry point (initial PC)
ld_environ_base          0x7fff8000 # program environment base address address
ld_target_big_endian              0 # target executable endian-ness, non-zero if big endian
mem.page_count                  806 # total number of pages allocated
mem.page_mem                  3224k # total size of memory pages allocated
mem.ptab_misses                 816 # total first level page table misses
mem.ptab_accesses         579990118 # total page table accesses
mem.ptab_miss_rate           0.0000 # first level page table miss rate
```

| # of Sets (split cache) | 1-way Miss rate | 1-way AMAT | 2-way Miss rate | 2-way AMAT | 4-way Miss rate | 4-way AMAT | 8-way Miss rate | 8-way AMAT |
|-------------------------|-----------------|------------|-----------------|------------|-----------------|------------|-----------------|------------|
| 64                      | 0.3244          | 1.585      | 0.2640          | 1.410      | 7.2055          | 1.300      | 1.1348          | 1.174      |
| 128                     | 0.2681          | 1.439      | 0.2047          | 1.218      | 0.1363          | 1.1741     | 0.2319          | 1.1739     |
| 256                     | 0.2319          | 1.363      | 0.1506          | 1.199      | 0.0147          | 1.084      | 0.0318          | 1.0657     |
| 512                     | 0.1640          | 1.223      | 0.0874          | 1.183      | 0.035           | 1.114      | 0.011           | (١٥١)      |
| 1024                    | 0.1190          | 1.149      | 0.0523          | 1.058      | 0.0196          | 1.020      | 0.0084          | 1.017      |
| 2048                    | 0.0727          | 1.084      | 0.0265          | 1.0299     | 0.0012          | 1.0094     | 0.0022          | 1.002      |
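One reason associativity helps is that blocks mapping to the same set no longer evict each other. The sketch below is a toy set-associative cache with LRU replacement, not the SimpleScalar implementation; the function name `set_assoc_miss_rate` and the two-address conflict trace are assumptions made for illustration.

```python
from collections import OrderedDict


def set_assoc_miss_rate(addresses, nsets, block_size, assoc):
    """Simulate an LRU set-associative cache and return misses / accesses."""
    # Each set keeps its resident tags ordered from least to most recent.
    sets = [OrderedDict() for _ in range(nsets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % nsets
        tag = block // nsets
        ways = sets[index]
        if tag in ways:
            ways.move_to_end(tag)          # hit: mark as most recently used
        else:
            misses += 1
            if len(ways) >= assoc:
                ways.popitem(last=False)   # evict the least recently used tag
            ways[tag] = True
    return misses / len(addresses) if addresses else 0.0


# Hypothetical trace: two addresses that map to the same set, alternating.
conflict_trace = [0, 64 * 16] * 100
one_way = set_assoc_miss_rate(conflict_trace, nsets=64, block_size=16, assoc=1)
two_way = set_assoc_miss_rate(conflict_trace, nsets=64, block_size=16, assoc=2)
print(one_way, two_way)
```

With one way, the two blocks keep evicting each other and every access misses; with two ways, only the two initial cold misses remain.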

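#### <Problems and Discussion>

Several conditions determine how low the miss rate can be. Increasing the block size takes advantage of spatial locality, and a large cache works best in that case. For associativity, the hardware cost rises as the number of ways increases. For a small cache, the benefit gradually diminishes while the hardware resources required keep growing. When associativity is low, LRU replacement is typically used; when it is high, random replacement is typically used.

In addition, for some of the conditions in the project specification only one of the two benchmarks was provided, and for other conditions the corresponding files were reported as missing, so I could not run those experiments properly.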